Computer Vision¶
PART A - 20 Marks¶
- DOMAIN: Entertainment
- CONTEXT: Company X owns a movie application and repository that streams movies to millions of users on a subscription basis.
The company wants to automate the process of identifying cast and crew in each scene of a movie, so that when a user pauses
the movie and clicks the cast information button, the app shows the details of the actors in the scene. The company has in-house
computer vision and multimedia experts who need to detect faces in screenshots from movie scenes.
The data labelling is already done.
- DATA DESCRIPTION: The dataset comprises images and the corresponding human-face mask for each image.
- PROJECT OBJECTIVE: To build a face detection system.
Steps and tasks: [ Total Score: 20 Marks]¶
1. Import and Understand the data [7 Marks]¶
- Import and read ‘images.npy’. [1 Mark]
- Split the data into features (X) & labels (Y). Unify the shape of all the images. [3 Marks]
Imp Note: Replace all the pixels within the masked area with 1.
Hint: X will comprise the image arrays, whereas Y will comprise the coordinates of the mask (human face). Observe: data[0], data[0][0], data[0][1].
- Split the data into train and test [400:9]. [1 Mark]
- Select a random image from the train data and display the original image and the masked image. [2 Marks]
Q. 1.A. Import and read ‘images.npy’.¶
In [ ]:
#Import required libraries
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
from IPython.display import clear_output
import zipfile
import tensorflow as tf
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from tqdm import tqdm
import warnings
In [ ]:
RunningInCOLAB = 'google.colab' in str(get_ipython()) if hasattr(__builtins__, '__IPYTHON__') else False
if RunningInCOLAB:
    # We are running on Google Colab
    from google.colab import drive
    drive.mount('/content/drive')
    project_dir = '/content/drive/MyDrive/Tuhin/AI-ML Course - UT Austin/Projects/8-Computer Vision'  # Our project directory is mounted here
else:
    # We are running on a local machine
    project_dir = '.'  # Our project directory is the local directory
In [ ]:
# Load images and labels from the .npy file
file_path = f'{project_dir}/images.npy'
data = np.load(file_path, allow_pickle=True)
data.shape
Out[ ]:
(409, 2)
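Per the hint (`data[0]`, `data[0][0]`, `data[0][1]`), each row of the loaded array pairs an image with its mask annotations. A minimal sketch of that structure on toy values (the image shape and dictionary keys here are assumptions inferred from the population loop used later):

```python
import numpy as np

# Toy stand-in mirroring the observed structure of 'images.npy':
# each row holds an image array at index 0 and a list of mask dicts
# (with normalized corner points) at index 1. Values here are made up.
toy_data = np.empty((1, 2), dtype=object)
toy_data[0, 0] = np.zeros((480, 640, 3), dtype=np.uint8)       # image array
toy_data[0, 1] = [{'label': 'Face',
                   'points': [{'x': 0.25, 'y': 0.30},          # top-left corner (normalized)
                              {'x': 0.55, 'y': 0.70}]}]        # bottom-right corner (normalized)

print(toy_data.shape)                 # (1, 2) -- the real array is (409, 2)
print(toy_data[0][0].shape)           # per-image shape varies in the real data
print(toy_data[0][1][0]['label'])     # 'Face'
```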
Q. 1.B. Split the data into Features(X) & labels(Y). Unify shape of all the images.¶
In [ ]:
#First standardize the image shape.
#We use MobileNetV2 for transfer learning; it expects input images of shape (224,224,3).
image_height = 224
image_width = 224
channels = 3
#Create the X and Y sets
X = np.zeros((data.shape[0], image_height, image_width, channels))  #Contains the original images (resized)
Y = np.zeros((data.shape[0], image_height, image_width))            #Contains masks corresponding to the face coordinates
#Now populate the X and Y sets
no_of_images = len(data)
#Loop through the data to extract the face region and replace those pixel values with 1
errors = []
pbar = tqdm(range(no_of_images), ascii=True)
for i in pbar:
    img = data[i][0]  #Load the original image array
    img = cv2.resize(img, dsize=(image_width, image_height), interpolation=cv2.INTER_CUBIC)  #Resize the image to 224x224 (dsize is (width, height))
    #If an image has only 1 channel (grayscale), convert it to 3 channels (color).
    #We use exactly 3 channels, so discard the alpha channel if it exists.
    try:
        if len(img.shape) == 2:  #Image is a 2D array
            errors.append(f"Found image {i} as a grayscale image; converted it to a 3-channel color image.\n")
            #Convert the grayscale image to color so that the number of channels is standardized to 3
            img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
        if img.shape[2] > 3:  #Image has more than 3 channels, most likely an alpha channel is present
            #Discard the extra channels
            img = img[:, :, :3]
    except Exception as e:
        #We discard images with unusual shapes
        errors.append(f"Exception received: {e}. Discarded image {i}.\n")
        continue
    #Now populate the X and Y sets
    X[i] = np.array(img, dtype=np.float32)
    for mask in data[i][1]:
        if 'Face' in mask['label']:
            x1 = int(mask['points'][0]['x'] * image_width)
            y1 = int(mask['points'][0]['y'] * image_height)
            x2 = int(mask['points'][1]['x'] * image_width)
            y2 = int(mask['points'][1]['y'] * image_height)
            Y[i][y1:y2, x1:x2] = 1  #Set all pixels within the mask coordinates to 1
print()
print()
for error in errors:
    print(error)
print(f"X and Y populated, shape of X is '{X.shape}' and the shape of Y is '{Y.shape}'")
100%|##########| 409/409 [00:00<00:00, 882.33it/s]
Found image 272 as a grayscale image; converted it to a 3-channel color image.
X and Y populated, shape of X is '(409, 224, 224, 3)' and the shape of Y is '(409, 224, 224)'
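The per-mask step of the loop above can be isolated into a small helper for clarity; a sketch with an illustrative box (the function name and toy values are ours, not part of the assignment):

```python
import numpy as np

def box_to_mask(points, height, width):
    """Turn a normalized two-corner box into a binary mask (1 inside the face)."""
    mask = np.zeros((height, width))
    x1, y1 = int(points[0]['x'] * width), int(points[0]['y'] * height)
    x2, y2 = int(points[1]['x'] * width), int(points[1]['y'] * height)
    mask[y1:y2, x1:x2] = 1  # all pixels within the box are set to 1
    return mask

m = box_to_mask([{'x': 0.25, 'y': 0.25}, {'x': 0.75, 'y': 0.75}], 224, 224)
print(int(m.sum()))   # 12544 = 112 * 112 pixels inside the box
```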
Q. 1.C. Split the data into train and test[400:9].¶
In [ ]:
#Split X and Y into train and test sets with a 400:9 ratio
X_train = X[:400]
Y_train = Y[:400]
X_test = X[400:]
Y_test = Y[400:]
print(X_train.shape)
print(X_test.shape)
(400, 224, 224, 3)
(9, 224, 224, 3)
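The slice above keeps the original file order; since `train_test_split` is already imported, a shuffled 400:9 split could equally be done as sketched below on small dummy arrays (`random_state` is an arbitrary choice of ours):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Dummy arrays standing in for X (409 images) and Y (409 masks);
# small spatial size just to keep the sketch light.
X_dummy = np.zeros((409, 8, 8, 3))
Y_dummy = np.zeros((409, 8, 8))
X_tr, X_te, Y_tr, Y_te = train_test_split(X_dummy, Y_dummy, test_size=9, random_state=42)
print(X_tr.shape[0], X_te.shape[0])   # 400 9
```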
Q. 1.D. Select random image from the train data and display original image and masked image.¶
In [ ]:
def show_image(index):
    fig, axs = plt.subplots(1, 3, figsize=(20, 10))
    axs[0].imshow((X_train[index] / 255).astype(np.float32))
    axs[0].set_title("Original Image")
    axs[0].axis('off')
    axs[1].imshow((X_train[index] / 255).astype(np.float32))
    axs[1].imshow(Y_train[index], alpha=0.5)
    axs[1].set_title("Masked Area where Face is found")
    axs[1].axis('off')
    #Draw a contour around the labeled face using the masked area
    contours, _ = cv2.findContours(Y_train[index].astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    contoured_image = np.copy(X_train[index])
    cv2.drawContours(contoured_image, contours, -1, (0, 255, 0), 1)
    axs[2].imshow((contoured_image / 255).astype(np.float32))
    axs[2].set_title("Labeled Image (Colored Contour around Face)")
    axs[2].axis('off')
    plt.show()
#Select 4 random images from the train data and display the original and masked images
indexes = np.random.randint(0, X_train.shape[0], size=4)
for index in indexes:
    show_image(index)
2. Model building [11 Marks]¶
- Design a face mask detection model. [4 Marks]
Hint: 1. Use the MobileNet architecture for the initial pre-trained, non-trainable layers.
Hint: 2. Add appropriate upsampling layers to imitate the U-Net architecture.
- Design your own Dice coefficient and loss function. [2 Marks]
- Train and tune the model as required. [3 Marks]
- Evaluate and share insights on the performance of the model. [2 Marks]
Q. 2.A. Design a face mask detection model¶
In [ ]:
# Define the model
def model():
    #We use MobileNetV2 for transfer learning; it expects input images of shape (224,224,3).
    #Input image layer
    input = tf.keras.layers.Input([image_height, image_width, 3], dtype=tf.uint8, name="original_input_image")
    #Preprocess the input image
    x = tf.cast(input, tf.float32)
    input_image_name = x.name.split('/')[0]
    x = tf.keras.applications.mobilenet.preprocess_input(x)
    #Load the MobileNetV2 model with the preprocessed input images
    encoder = tf.keras.applications.MobileNetV2(input_tensor=x, input_shape=(image_height, image_width, 3), weights="imagenet", include_top=False, alpha=0.35)
    #Make the encoder (including all its sub-layers) non-trainable
    encoder.trainable = False
    encoder_output = encoder.get_layer("block_13_expand_relu").output
    skip_connection_names = [input_image_name, "block_1_expand_relu", "block_3_expand_relu", "block_6_expand_relu"]
    #Decoder
    #Convolution filters for the decoder: the last block has 16 filters; earlier blocks have 64, 48 and 32
    f = [16, 32, 48, 64]
    #The below 'x' will be the input to our decoder
    x = encoder_output
    # There are four repeated decoder blocks corresponding to the skip connections, taken in reverse order
    # ('block 6 relu', 'block 3 relu', 'block 1 relu', then the input image). Each block has:
    ### 1. One upsampling (doubling the spatial dimensions), concatenated with the output of the corresponding encoder layer
    ### 2. Two sets of 3x3 convolution + batch normalization + ReLU activation
    ###    (the number of filters decreases per block in the order 64, 48, 32, 16)
    for i in range(1, len(skip_connection_names) + 1, 1):
        #Sub-layer 1
        x_skip = encoder.get_layer(skip_connection_names[-i]).output
        x = tf.keras.layers.UpSampling2D((2, 2))(x)
        x = tf.keras.layers.Concatenate()([x, x_skip])
        #Sub-layer 2
        x = tf.keras.layers.Conv2D(f[-i], (3, 3), padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)
        #Sub-layer 3
        x = tf.keras.layers.Conv2D(f[-i], (3, 3), padding="same")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Activation("relu")(x)
    #Output with sigmoid activation - per-pixel values between 0 and 1
    x = tf.keras.layers.Conv2D(1, (1, 1), padding="same")(x)
    output = tf.keras.layers.Activation("sigmoid")(x)
    model = tf.keras.models.Model(inputs=[input], outputs=[output])
    return model
In [ ]:
#Create the model
model = model()
model.summary()
WARNING:tensorflow:From C:\Users\tuhin.sengupta\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\keras\src\backend.py:1398: The name tf.executing_eagerly_outside_functions is deprecated. Please use tf.compat.v1.executing_eagerly_outside_functions instead.
WARNING:tensorflow:From C:\Users\tuhin.sengupta\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\keras\src\layers\normalization\batch_normalization.py:979: The name tf.nn.fused_batch_norm is deprecated. Please use tf.compat.v1.nn.fused_batch_norm instead.
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
original_input_image (Inpu [(None, 224, 224, 3)] 0 []
tLayer)
tf.cast (TFOpLambda) (None, 224, 224, 3) 0 ['original_input_image[0][0]']
tf.math.truediv (TFOpLambd (None, 224, 224, 3) 0 ['tf.cast[0][0]']
a)
tf.math.subtract (TFOpLamb (None, 224, 224, 3) 0 ['tf.math.truediv[0][0]']
da)
Conv1 (Conv2D) (None, 112, 112, 16) 432 ['tf.math.subtract[0][0]']
bn_Conv1 (BatchNormalizati (None, 112, 112, 16) 64 ['Conv1[0][0]']
on)
Conv1_relu (ReLU) (None, 112, 112, 16) 0 ['bn_Conv1[0][0]']
expanded_conv_depthwise (D (None, 112, 112, 16) 144 ['Conv1_relu[0][0]']
epthwiseConv2D)
expanded_conv_depthwise_BN (None, 112, 112, 16) 64 ['expanded_conv_depthwise[0][0
(BatchNormalization) ]']
expanded_conv_depthwise_re (None, 112, 112, 16) 0 ['expanded_conv_depthwise_BN[0
lu (ReLU) ][0]']
expanded_conv_project (Con (None, 112, 112, 8) 128 ['expanded_conv_depthwise_relu
v2D) [0][0]']
expanded_conv_project_BN ( (None, 112, 112, 8) 32 ['expanded_conv_project[0][0]'
BatchNormalization) ]
block_1_expand (Conv2D) (None, 112, 112, 48) 384 ['expanded_conv_project_BN[0][
0]']
block_1_expand_BN (BatchNo (None, 112, 112, 48) 192 ['block_1_expand[0][0]']
rmalization)
block_1_expand_relu (ReLU) (None, 112, 112, 48) 0 ['block_1_expand_BN[0][0]']
block_1_pad (ZeroPadding2D (None, 113, 113, 48) 0 ['block_1_expand_relu[0][0]']
)
block_1_depthwise (Depthwi (None, 56, 56, 48) 432 ['block_1_pad[0][0]']
seConv2D)
block_1_depthwise_BN (Batc (None, 56, 56, 48) 192 ['block_1_depthwise[0][0]']
hNormalization)
block_1_depthwise_relu (Re (None, 56, 56, 48) 0 ['block_1_depthwise_BN[0][0]']
LU)
block_1_project (Conv2D) (None, 56, 56, 8) 384 ['block_1_depthwise_relu[0][0]
']
block_1_project_BN (BatchN (None, 56, 56, 8) 32 ['block_1_project[0][0]']
ormalization)
block_2_expand (Conv2D) (None, 56, 56, 48) 384 ['block_1_project_BN[0][0]']
block_2_expand_BN (BatchNo (None, 56, 56, 48) 192 ['block_2_expand[0][0]']
rmalization)
block_2_expand_relu (ReLU) (None, 56, 56, 48) 0 ['block_2_expand_BN[0][0]']
block_2_depthwise (Depthwi (None, 56, 56, 48) 432 ['block_2_expand_relu[0][0]']
seConv2D)
block_2_depthwise_BN (Batc (None, 56, 56, 48) 192 ['block_2_depthwise[0][0]']
hNormalization)
block_2_depthwise_relu (Re (None, 56, 56, 48) 0 ['block_2_depthwise_BN[0][0]']
LU)
block_2_project (Conv2D) (None, 56, 56, 8) 384 ['block_2_depthwise_relu[0][0]
']
block_2_project_BN (BatchN (None, 56, 56, 8) 32 ['block_2_project[0][0]']
ormalization)
block_2_add (Add) (None, 56, 56, 8) 0 ['block_1_project_BN[0][0]',
'block_2_project_BN[0][0]']
block_3_expand (Conv2D) (None, 56, 56, 48) 384 ['block_2_add[0][0]']
block_3_expand_BN (BatchNo (None, 56, 56, 48) 192 ['block_3_expand[0][0]']
rmalization)
block_3_expand_relu (ReLU) (None, 56, 56, 48) 0 ['block_3_expand_BN[0][0]']
block_3_pad (ZeroPadding2D (None, 57, 57, 48) 0 ['block_3_expand_relu[0][0]']
)
block_3_depthwise (Depthwi (None, 28, 28, 48) 432 ['block_3_pad[0][0]']
seConv2D)
block_3_depthwise_BN (Batc (None, 28, 28, 48) 192 ['block_3_depthwise[0][0]']
hNormalization)
block_3_depthwise_relu (Re (None, 28, 28, 48) 0 ['block_3_depthwise_BN[0][0]']
LU)
block_3_project (Conv2D) (None, 28, 28, 16) 768 ['block_3_depthwise_relu[0][0]
']
block_3_project_BN (BatchN (None, 28, 28, 16) 64 ['block_3_project[0][0]']
ormalization)
block_4_expand (Conv2D) (None, 28, 28, 96) 1536 ['block_3_project_BN[0][0]']
block_4_expand_BN (BatchNo (None, 28, 28, 96) 384 ['block_4_expand[0][0]']
rmalization)
block_4_expand_relu (ReLU) (None, 28, 28, 96) 0 ['block_4_expand_BN[0][0]']
block_4_depthwise (Depthwi (None, 28, 28, 96) 864 ['block_4_expand_relu[0][0]']
seConv2D)
block_4_depthwise_BN (Batc (None, 28, 28, 96) 384 ['block_4_depthwise[0][0]']
hNormalization)
block_4_depthwise_relu (Re (None, 28, 28, 96) 0 ['block_4_depthwise_BN[0][0]']
LU)
block_4_project (Conv2D) (None, 28, 28, 16) 1536 ['block_4_depthwise_relu[0][0]
']
block_4_project_BN (BatchN (None, 28, 28, 16) 64 ['block_4_project[0][0]']
ormalization)
block_4_add (Add) (None, 28, 28, 16) 0 ['block_3_project_BN[0][0]',
'block_4_project_BN[0][0]']
block_5_expand (Conv2D) (None, 28, 28, 96) 1536 ['block_4_add[0][0]']
block_5_expand_BN (BatchNo (None, 28, 28, 96) 384 ['block_5_expand[0][0]']
rmalization)
block_5_expand_relu (ReLU) (None, 28, 28, 96) 0 ['block_5_expand_BN[0][0]']
block_5_depthwise (Depthwi (None, 28, 28, 96) 864 ['block_5_expand_relu[0][0]']
seConv2D)
block_5_depthwise_BN (Batc (None, 28, 28, 96) 384 ['block_5_depthwise[0][0]']
hNormalization)
block_5_depthwise_relu (Re (None, 28, 28, 96) 0 ['block_5_depthwise_BN[0][0]']
LU)
block_5_project (Conv2D) (None, 28, 28, 16) 1536 ['block_5_depthwise_relu[0][0]
']
block_5_project_BN (BatchN (None, 28, 28, 16) 64 ['block_5_project[0][0]']
ormalization)
block_5_add (Add) (None, 28, 28, 16) 0 ['block_4_add[0][0]',
'block_5_project_BN[0][0]']
block_6_expand (Conv2D) (None, 28, 28, 96) 1536 ['block_5_add[0][0]']
block_6_expand_BN (BatchNo (None, 28, 28, 96) 384 ['block_6_expand[0][0]']
rmalization)
block_6_expand_relu (ReLU) (None, 28, 28, 96) 0 ['block_6_expand_BN[0][0]']
block_6_pad (ZeroPadding2D (None, 29, 29, 96) 0 ['block_6_expand_relu[0][0]']
)
block_6_depthwise (Depthwi (None, 14, 14, 96) 864 ['block_6_pad[0][0]']
seConv2D)
block_6_depthwise_BN (Batc (None, 14, 14, 96) 384 ['block_6_depthwise[0][0]']
hNormalization)
block_6_depthwise_relu (Re (None, 14, 14, 96) 0 ['block_6_depthwise_BN[0][0]']
LU)
block_6_project (Conv2D) (None, 14, 14, 24) 2304 ['block_6_depthwise_relu[0][0]
']
block_6_project_BN (BatchN (None, 14, 14, 24) 96 ['block_6_project[0][0]']
ormalization)
block_7_expand (Conv2D) (None, 14, 14, 144) 3456 ['block_6_project_BN[0][0]']
block_7_expand_BN (BatchNo (None, 14, 14, 144) 576 ['block_7_expand[0][0]']
rmalization)
block_7_expand_relu (ReLU) (None, 14, 14, 144) 0 ['block_7_expand_BN[0][0]']
block_7_depthwise (Depthwi (None, 14, 14, 144) 1296 ['block_7_expand_relu[0][0]']
seConv2D)
block_7_depthwise_BN (Batc (None, 14, 14, 144) 576 ['block_7_depthwise[0][0]']
hNormalization)
block_7_depthwise_relu (Re (None, 14, 14, 144) 0 ['block_7_depthwise_BN[0][0]']
LU)
block_7_project (Conv2D) (None, 14, 14, 24) 3456 ['block_7_depthwise_relu[0][0]
']
block_7_project_BN (BatchN (None, 14, 14, 24) 96 ['block_7_project[0][0]']
ormalization)
block_7_add (Add) (None, 14, 14, 24) 0 ['block_6_project_BN[0][0]',
'block_7_project_BN[0][0]']
block_8_expand (Conv2D) (None, 14, 14, 144) 3456 ['block_7_add[0][0]']
block_8_expand_BN (BatchNo (None, 14, 14, 144) 576 ['block_8_expand[0][0]']
rmalization)
block_8_expand_relu (ReLU) (None, 14, 14, 144) 0 ['block_8_expand_BN[0][0]']
block_8_depthwise (Depthwi (None, 14, 14, 144) 1296 ['block_8_expand_relu[0][0]']
seConv2D)
block_8_depthwise_BN (Batc (None, 14, 14, 144) 576 ['block_8_depthwise[0][0]']
hNormalization)
block_8_depthwise_relu (Re (None, 14, 14, 144) 0 ['block_8_depthwise_BN[0][0]']
LU)
block_8_project (Conv2D) (None, 14, 14, 24) 3456 ['block_8_depthwise_relu[0][0]
']
block_8_project_BN (BatchN (None, 14, 14, 24) 96 ['block_8_project[0][0]']
ormalization)
block_8_add (Add) (None, 14, 14, 24) 0 ['block_7_add[0][0]',
'block_8_project_BN[0][0]']
block_9_expand (Conv2D) (None, 14, 14, 144) 3456 ['block_8_add[0][0]']
block_9_expand_BN (BatchNo (None, 14, 14, 144) 576 ['block_9_expand[0][0]']
rmalization)
block_9_expand_relu (ReLU) (None, 14, 14, 144) 0 ['block_9_expand_BN[0][0]']
block_9_depthwise (Depthwi (None, 14, 14, 144) 1296 ['block_9_expand_relu[0][0]']
seConv2D)
block_9_depthwise_BN (Batc (None, 14, 14, 144) 576 ['block_9_depthwise[0][0]']
hNormalization)
block_9_depthwise_relu (Re (None, 14, 14, 144) 0 ['block_9_depthwise_BN[0][0]']
LU)
block_9_project (Conv2D) (None, 14, 14, 24) 3456 ['block_9_depthwise_relu[0][0]
']
block_9_project_BN (BatchN (None, 14, 14, 24) 96 ['block_9_project[0][0]']
ormalization)
block_9_add (Add) (None, 14, 14, 24) 0 ['block_8_add[0][0]',
'block_9_project_BN[0][0]']
block_10_expand (Conv2D) (None, 14, 14, 144) 3456 ['block_9_add[0][0]']
block_10_expand_BN (BatchN (None, 14, 14, 144) 576 ['block_10_expand[0][0]']
ormalization)
block_10_expand_relu (ReLU (None, 14, 14, 144) 0 ['block_10_expand_BN[0][0]']
)
block_10_depthwise (Depthw (None, 14, 14, 144) 1296 ['block_10_expand_relu[0][0]']
iseConv2D)
block_10_depthwise_BN (Bat (None, 14, 14, 144) 576 ['block_10_depthwise[0][0]']
chNormalization)
block_10_depthwise_relu (R (None, 14, 14, 144) 0 ['block_10_depthwise_BN[0][0]'
eLU) ]
block_10_project (Conv2D) (None, 14, 14, 32) 4608 ['block_10_depthwise_relu[0][0
]']
block_10_project_BN (Batch (None, 14, 14, 32) 128 ['block_10_project[0][0]']
Normalization)
block_11_expand (Conv2D) (None, 14, 14, 192) 6144 ['block_10_project_BN[0][0]']
block_11_expand_BN (BatchN (None, 14, 14, 192) 768 ['block_11_expand[0][0]']
ormalization)
block_11_expand_relu (ReLU (None, 14, 14, 192) 0 ['block_11_expand_BN[0][0]']
)
block_11_depthwise (Depthw (None, 14, 14, 192) 1728 ['block_11_expand_relu[0][0]']
iseConv2D)
block_11_depthwise_BN (Bat (None, 14, 14, 192) 768 ['block_11_depthwise[0][0]']
chNormalization)
block_11_depthwise_relu (R (None, 14, 14, 192) 0 ['block_11_depthwise_BN[0][0]'
eLU) ]
block_11_project (Conv2D) (None, 14, 14, 32) 6144 ['block_11_depthwise_relu[0][0
]']
block_11_project_BN (Batch (None, 14, 14, 32) 128 ['block_11_project[0][0]']
Normalization)
block_11_add (Add) (None, 14, 14, 32) 0 ['block_10_project_BN[0][0]',
'block_11_project_BN[0][0]']
block_12_expand (Conv2D) (None, 14, 14, 192) 6144 ['block_11_add[0][0]']
block_12_expand_BN (BatchN (None, 14, 14, 192) 768 ['block_12_expand[0][0]']
ormalization)
block_12_expand_relu (ReLU (None, 14, 14, 192) 0 ['block_12_expand_BN[0][0]']
)
block_12_depthwise (Depthw (None, 14, 14, 192) 1728 ['block_12_expand_relu[0][0]']
iseConv2D)
block_12_depthwise_BN (Bat (None, 14, 14, 192) 768 ['block_12_depthwise[0][0]']
chNormalization)
block_12_depthwise_relu (R (None, 14, 14, 192) 0 ['block_12_depthwise_BN[0][0]'
eLU) ]
block_12_project (Conv2D) (None, 14, 14, 32) 6144 ['block_12_depthwise_relu[0][0
]']
block_12_project_BN (Batch (None, 14, 14, 32) 128 ['block_12_project[0][0]']
Normalization)
block_12_add (Add) (None, 14, 14, 32) 0 ['block_11_add[0][0]',
'block_12_project_BN[0][0]']
block_13_expand (Conv2D) (None, 14, 14, 192) 6144 ['block_12_add[0][0]']
block_13_expand_BN (BatchN (None, 14, 14, 192) 768 ['block_13_expand[0][0]']
ormalization)
block_13_expand_relu (ReLU (None, 14, 14, 192) 0 ['block_13_expand_BN[0][0]']
)
up_sampling2d (UpSampling2 (None, 28, 28, 192) 0 ['block_13_expand_relu[0][0]']
D)
concatenate (Concatenate) (None, 28, 28, 288) 0 ['up_sampling2d[0][0]',
'block_6_expand_relu[0][0]']
conv2d (Conv2D) (None, 28, 28, 64) 165952 ['concatenate[0][0]']
batch_normalization (Batch (None, 28, 28, 64) 256 ['conv2d[0][0]']
Normalization)
activation (Activation) (None, 28, 28, 64) 0 ['batch_normalization[0][0]']
conv2d_1 (Conv2D) (None, 28, 28, 64) 36928 ['activation[0][0]']
batch_normalization_1 (Bat (None, 28, 28, 64) 256 ['conv2d_1[0][0]']
chNormalization)
activation_1 (Activation) (None, 28, 28, 64) 0 ['batch_normalization_1[0][0]'
]
up_sampling2d_1 (UpSamplin (None, 56, 56, 64) 0 ['activation_1[0][0]']
g2D)
concatenate_1 (Concatenate (None, 56, 56, 112) 0 ['up_sampling2d_1[0][0]',
) 'block_3_expand_relu[0][0]']
conv2d_2 (Conv2D) (None, 56, 56, 48) 48432 ['concatenate_1[0][0]']
batch_normalization_2 (Bat (None, 56, 56, 48) 192 ['conv2d_2[0][0]']
chNormalization)
activation_2 (Activation) (None, 56, 56, 48) 0 ['batch_normalization_2[0][0]'
]
conv2d_3 (Conv2D) (None, 56, 56, 48) 20784 ['activation_2[0][0]']
batch_normalization_3 (Bat (None, 56, 56, 48) 192 ['conv2d_3[0][0]']
chNormalization)
activation_3 (Activation) (None, 56, 56, 48) 0 ['batch_normalization_3[0][0]'
]
up_sampling2d_2 (UpSamplin (None, 112, 112, 48) 0 ['activation_3[0][0]']
g2D)
concatenate_2 (Concatenate (None, 112, 112, 96) 0 ['up_sampling2d_2[0][0]',
) 'block_1_expand_relu[0][0]']
conv2d_4 (Conv2D) (None, 112, 112, 32) 27680 ['concatenate_2[0][0]']
batch_normalization_4 (Bat (None, 112, 112, 32) 128 ['conv2d_4[0][0]']
chNormalization)
activation_4 (Activation) (None, 112, 112, 32) 0 ['batch_normalization_4[0][0]'
]
conv2d_5 (Conv2D) (None, 112, 112, 32) 9248 ['activation_4[0][0]']
batch_normalization_5 (Bat (None, 112, 112, 32) 128 ['conv2d_5[0][0]']
chNormalization)
activation_5 (Activation) (None, 112, 112, 32) 0 ['batch_normalization_5[0][0]'
]
up_sampling2d_3 (UpSamplin (None, 224, 224, 32) 0 ['activation_5[0][0]']
g2D)
concatenate_3 (Concatenate (None, 224, 224, 35) 0 ['up_sampling2d_3[0][0]',
) 'tf.cast[0][0]']
conv2d_6 (Conv2D) (None, 224, 224, 16) 5056 ['concatenate_3[0][0]']
batch_normalization_6 (Bat (None, 224, 224, 16) 64 ['conv2d_6[0][0]']
chNormalization)
activation_6 (Activation) (None, 224, 224, 16) 0 ['batch_normalization_6[0][0]'
]
conv2d_7 (Conv2D) (None, 224, 224, 16) 2320 ['activation_6[0][0]']
batch_normalization_7 (Bat (None, 224, 224, 16) 64 ['conv2d_7[0][0]']
chNormalization)
activation_7 (Activation) (None, 224, 224, 16) 0 ['batch_normalization_7[0][0]'
]
conv2d_8 (Conv2D) (None, 224, 224, 1) 17 ['activation_7[0][0]']
activation_8 (Activation) (None, 224, 224, 1) 0 ['conv2d_8[0][0]']
==================================================================================================
Total params: 416209 (1.59 MB)
Trainable params: 317057 (1.21 MB)
Non-trainable params: 99152 (387.31 KB)
__________________________________________________________________________________________________
In [ ]:
#Show the model architecture
dot_img_file = f'{project_dir}/model.png' # model image file
tf.keras.utils.plot_model(model, to_file=dot_img_file, show_shapes=True, show_layer_activations=True, show_trainable=True)
fig, axs = plt.subplots(1, 1, figsize=(400, 200))
axs.imshow(plt.imread(dot_img_file))
axs.set_title('Model Architecture', fontsize=26, fontweight='bold', color='Green')
axs.axis('off')
plt.show()
Q. 2.B. Design your own Dice Coefficient and Loss function.¶
In [ ]:
smooth = tf.keras.backend.epsilon()

def dice_coef(y_true, y_pred):
    y_true_flatten = tf.keras.backend.flatten(y_true)
    y_pred_flatten = tf.keras.backend.flatten(y_pred)
    intersection = tf.reduce_sum(y_true_flatten * y_pred_flatten)
    union = tf.reduce_sum(y_true_flatten) + tf.reduce_sum(y_pred_flatten)
    return (2 * intersection + smooth) / (union + smooth)

def dice_loss(y_true, y_pred):
    return 1.0 - dice_coef(y_true, y_pred)
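To sanity-check the formula, the same Dice coefficient can be re-implemented in plain NumPy and evaluated on tiny masks (a sketch only; the TF version above is what the model actually uses, and the `smooth` default here simply mirrors Keras's epsilon):

```python
import numpy as np

def dice_np(y_true, y_pred, smooth=1e-7):
    # Dice = 2*|A ∩ B| / (|A| + |B|), smoothed to avoid division by zero
    yt, yp = y_true.ravel(), y_pred.ravel()
    intersection = (yt * yp).sum()
    return (2 * intersection + smooth) / (yt.sum() + yp.sum() + smooth)

a = np.array([[1, 1], [0, 0]], dtype=float)
b = np.array([[1, 0], [0, 0]], dtype=float)
print(round(dice_np(a, a), 4))   # 1.0  (perfect overlap)
print(round(dice_np(a, b), 4))   # 0.6667  (2*1 / (2+1))
```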
Q. 2.C. Train and tune the model as required.¶
In [ ]:
# Set some hyperparameters
epochs = 500 #@param {type:"integer"}
batch_size = 8 #@param {type:"integer"}
learning_rate = 1e-4 #@param {type:"number"}
In [ ]:
#Compile the model with optimizer and metrics
opt = tf.keras.optimizers.Nadam(learning_rate)
metrics = [dice_coef, tf.keras.metrics.Recall(name='recall'),tf.keras.metrics.Precision(name='precision')]
model.compile(loss=dice_loss, optimizer=opt, metrics=metrics)
In [ ]:
#Callbacks
callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=5),
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=False)
]
In [ ]:
#Fit the model
train_steps = len(X_train) // batch_size
test_steps = len(X_test) // batch_size
if len(X_train) % batch_size != 0:
    train_steps += 1
if len(X_test) % batch_size != 0:
    test_steps += 1
model_run = model.fit(
    X_train, Y_train,
    validation_data=(X_test, Y_test),
    epochs=epochs,
    steps_per_epoch=train_steps,
    validation_steps=test_steps,
    callbacks=callbacks
)
Epoch 1/500
50/50 [==============================] - 19s 57ms/step - loss: 0.7820 - dice_coef: 0.2180 - recall: 0.8853 - precision: 0.1276 - val_loss: 0.7699 - val_dice_coef: 0.2264 - val_recall: 0.9951 - val_precision: 0.1089 - lr: 1.0000e-04
Epoch 2/500
50/50 [==============================] - 1s 20ms/step - loss: 0.7325 - dice_coef: 0.2675 - recall: 0.9441 - precision: 0.1580 - val_loss: 0.7408 - val_dice_coef: 0.2550 - val_recall: 0.9923 - val_precision: 0.1541 - lr: 1.0000e-04
Epoch 3/500
50/50 [==============================] - 1s 20ms/step - loss: 0.6888 - dice_coef: 0.3112 - recall: 0.9137 - precision: 0.2413 - val_loss: 0.6853 - val_dice_coef: 0.3109 - val_recall: 0.9268 - val_precision: 0.3043 - lr: 1.0000e-04
Epoch 4/500
50/50 [==============================] - 1s 21ms/step - loss: 0.6504 - dice_coef: 0.3496 - recall: 0.9112 - precision: 0.3056 - val_loss: 0.6665 - val_dice_coef: 0.3302 - val_recall: 0.9649 - val_precision: 0.3186 - lr: 1.0000e-04
Epoch 5/500
50/50 [==============================] - 1s 20ms/step - loss: 0.6218 - dice_coef: 0.3782 - recall: 0.9053 - precision: 0.3698 - val_loss: 0.6422 - val_dice_coef: 0.3545 - val_recall: 0.9188 - val_precision: 0.4044 - lr: 1.0000e-04
Epoch 6/500
50/50 [==============================] - 1s 20ms/step - loss: 0.6006 - dice_coef: 0.3994 - recall: 0.9052 - precision: 0.4269 - val_loss: 0.6387 - val_dice_coef: 0.3577 - val_recall: 0.9276 - val_precision: 0.4071 - lr: 1.0000e-04
Epoch 7/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5806 - dice_coef: 0.4194 - recall: 0.9065 - precision: 0.4744 - val_loss: 0.6299 - val_dice_coef: 0.3663 - val_recall: 0.8049 - val_precision: 0.5594 - lr: 1.0000e-04
Epoch 8/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5631 - dice_coef: 0.4369 - recall: 0.9049 - precision: 0.5244 - val_loss: 0.6405 - val_dice_coef: 0.3558 - val_recall: 0.7109 - val_precision: 0.6115 - lr: 1.0000e-04
Epoch 9/500
50/50 [==============================] - 1s 19ms/step - loss: 0.5515 - dice_coef: 0.4485 - recall: 0.9050 - precision: 0.5453 - val_loss: 0.6116 - val_dice_coef: 0.3857 - val_recall: 0.8037 - val_precision: 0.5838 - lr: 1.0000e-04
Epoch 10/500
50/50 [==============================] - 1s 20ms/step - loss: 0.5429 - dice_coef: 0.4571 - recall: 0.9095 - precision: 0.5546 - val_loss: 0.6075 - val_dice_coef: 0.3905 - val_recall: 0.7536 - val_precision: 0.6145 - lr: 1.0000e-04
Epoch 11/500
50/50 [==============================] - 1s 20ms/step - loss: 0.5317 - dice_coef: 0.4683 - recall: 0.9067 - precision: 0.5778 - val_loss: 0.5942 - val_dice_coef: 0.4038 - val_recall: 0.7570 - val_precision: 0.6333 - lr: 1.0000e-04
Epoch 12/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5201 - dice_coef: 0.4799 - recall: 0.9138 - precision: 0.5889 - val_loss: 0.5931 - val_dice_coef: 0.4058 - val_recall: 0.7249 - val_precision: 0.6142 - lr: 1.0000e-04
Epoch 13/500
50/50 [==============================] - 1s 21ms/step - loss: 0.5041 - dice_coef: 0.4959 - recall: 0.9207 - precision: 0.6126 - val_loss: 0.5744 - val_dice_coef: 0.4238 - val_recall: 0.8565 - val_precision: 0.5138 - lr: 1.0000e-04
Epoch 14/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4965 - dice_coef: 0.5035 - recall: 0.9212 - precision: 0.6210 - val_loss: 0.5640 - val_dice_coef: 0.4339 - val_recall: 0.8516 - val_precision: 0.5299 - lr: 1.0000e-04
Epoch 15/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4877 - dice_coef: 0.5123 - recall: 0.9185 - precision: 0.6313 - val_loss: 0.5657 - val_dice_coef: 0.4327 - val_recall: 0.7484 - val_precision: 0.6165 - lr: 1.0000e-04
Epoch 16/500
50/50 [==============================] - 1s 19ms/step - loss: 0.4788 - dice_coef: 0.5212 - recall: 0.9221 - precision: 0.6466 - val_loss: 0.5785 - val_dice_coef: 0.4184 - val_recall: 0.9561 - val_precision: 0.3922 - lr: 1.0000e-04
Epoch 17/500
50/50 [==============================] - 1s 19ms/step - loss: 0.4731 - dice_coef: 0.5269 - recall: 0.9217 - precision: 0.6486 - val_loss: 0.5685 - val_dice_coef: 0.4309 - val_recall: 0.6515 - val_precision: 0.6999 - lr: 1.0000e-04
Epoch 18/500
50/50 [==============================] - 1s 21ms/step - loss: 0.4570 - dice_coef: 0.5430 - recall: 0.9283 - precision: 0.6679 - val_loss: 0.5509 - val_dice_coef: 0.4485 - val_recall: 0.7003 - val_precision: 0.6648 - lr: 1.0000e-04
Epoch 19/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4493 - dice_coef: 0.5507 - recall: 0.9295 - precision: 0.6785 - val_loss: 0.5301 - val_dice_coef: 0.4699 - val_recall: 0.7645 - val_precision: 0.6211 - lr: 1.0000e-04
Epoch 20/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4380 - dice_coef: 0.5620 - recall: 0.9327 - precision: 0.6875 - val_loss: 0.5406 - val_dice_coef: 0.4591 - val_recall: 0.6751 - val_precision: 0.6795 - lr: 1.0000e-04
Epoch 21/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4268 - dice_coef: 0.5732 - recall: 0.9349 - precision: 0.7013 - val_loss: 0.5159 - val_dice_coef: 0.4825 - val_recall: 0.7980 - val_precision: 0.5839 - lr: 1.0000e-04
Epoch 22/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4179 - dice_coef: 0.5821 - recall: 0.9337 - precision: 0.7128 - val_loss: 0.5016 - val_dice_coef: 0.4971 - val_recall: 0.8048 - val_precision: 0.5998 - lr: 1.0000e-04
Epoch 23/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4083 - dice_coef: 0.5917 - recall: 0.9372 - precision: 0.7189 - val_loss: 0.5048 - val_dice_coef: 0.4939 - val_recall: 0.7758 - val_precision: 0.6212 - lr: 1.0000e-04
Epoch 24/500
50/50 [==============================] - 1s 20ms/step - loss: 0.4058 - dice_coef: 0.5942 - recall: 0.9322 - precision: 0.7274 - val_loss: 0.5133 - val_dice_coef: 0.4866 - val_recall: 0.6767 - val_precision: 0.6767 - lr: 1.0000e-04
Epoch 25/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3934 - dice_coef: 0.6066 - recall: 0.9351 - precision: 0.7389 - val_loss: 0.4873 - val_dice_coef: 0.5112 - val_recall: 0.7896 - val_precision: 0.6171 - lr: 1.0000e-04
Epoch 26/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3847 - dice_coef: 0.6153 - recall: 0.9361 - precision: 0.7480 - val_loss: 0.5243 - val_dice_coef: 0.4751 - val_recall: 0.6374 - val_precision: 0.6799 - lr: 1.0000e-04
Epoch 27/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3728 - dice_coef: 0.6272 - recall: 0.9367 - precision: 0.7612 - val_loss: 0.4811 - val_dice_coef: 0.5169 - val_recall: 0.7809 - val_precision: 0.6139 - lr: 1.0000e-04
Epoch 28/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3630 - dice_coef: 0.6370 - recall: 0.9383 - precision: 0.7698 - val_loss: 0.4823 - val_dice_coef: 0.5162 - val_recall: 0.7812 - val_precision: 0.5973 - lr: 1.0000e-04
Epoch 29/500
50/50 [==============================] - 1s 21ms/step - loss: 0.3598 - dice_coef: 0.6402 - recall: 0.9408 - precision: 0.7695 - val_loss: 0.5064 - val_dice_coef: 0.4920 - val_recall: 0.6138 - val_precision: 0.7433 - lr: 1.0000e-04
Epoch 30/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3576 - dice_coef: 0.6424 - recall: 0.9261 - precision: 0.7778 - val_loss: 0.4874 - val_dice_coef: 0.5109 - val_recall: 0.6978 - val_precision: 0.6687 - lr: 1.0000e-04
Epoch 31/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3346 - dice_coef: 0.6654 - recall: 0.9397 - precision: 0.7948 - val_loss: 0.4822 - val_dice_coef: 0.5158 - val_recall: 0.6634 - val_precision: 0.7138 - lr: 1.0000e-04
Epoch 32/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3307 - dice_coef: 0.6693 - recall: 0.9393 - precision: 0.7998 - val_loss: 0.4728 - val_dice_coef: 0.5271 - val_recall: 0.6897 - val_precision: 0.6768 - lr: 1.0000e-04
Epoch 33/500
50/50 [==============================] - 1s 20ms/step - loss: 0.3197 - dice_coef: 0.6803 - recall: 0.9383 - precision: 0.8086 - val_loss: 0.4671 - val_dice_coef: 0.5304 - val_recall: 0.8481 - val_precision: 0.5338 - lr: 1.0000e-04
Epoch 34/500
50/50 [==============================] - 1s 21ms/step - loss: 0.3268 - dice_coef: 0.6732 - recall: 0.9318 - precision: 0.7973 - val_loss: 0.4596 - val_dice_coef: 0.5423 - val_recall: 0.6921 - val_precision: 0.6674 - lr: 1.0000e-04
Epoch 35/500
50/50 [==============================] - 1s 19ms/step - loss: 0.3027 - dice_coef: 0.6973 - recall: 0.9358 - precision: 0.8244 - val_loss: 0.4839 - val_dice_coef: 0.5139 - val_recall: 0.8790 - val_precision: 0.4874 - lr: 1.0000e-04
Epoch 36/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2945 - dice_coef: 0.7055 - recall: 0.9390 - precision: 0.8226 - val_loss: 0.4577 - val_dice_coef: 0.5437 - val_recall: 0.6511 - val_precision: 0.6927 - lr: 1.0000e-04
Epoch 37/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2836 - dice_coef: 0.7164 - recall: 0.9429 - precision: 0.8344 - val_loss: 0.4731 - val_dice_coef: 0.5269 - val_recall: 0.5915 - val_precision: 0.7409 - lr: 1.0000e-04
Epoch 38/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2719 - dice_coef: 0.7281 - recall: 0.9440 - precision: 0.8430 - val_loss: 0.4424 - val_dice_coef: 0.5574 - val_recall: 0.7289 - val_precision: 0.6340 - lr: 1.0000e-04
Epoch 39/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2699 - dice_coef: 0.7301 - recall: 0.9417 - precision: 0.8458 - val_loss: 0.4689 - val_dice_coef: 0.5292 - val_recall: 0.6017 - val_precision: 0.7138 - lr: 1.0000e-04
Epoch 40/500
50/50 [==============================] - 1s 19ms/step - loss: 0.2577 - dice_coef: 0.7423 - recall: 0.9410 - precision: 0.8565 - val_loss: 0.4200 - val_dice_coef: 0.5781 - val_recall: 0.8198 - val_precision: 0.5906 - lr: 1.0000e-04
Epoch 41/500
50/50 [==============================] - 1s 19ms/step - loss: 0.2544 - dice_coef: 0.7456 - recall: 0.9414 - precision: 0.8572 - val_loss: 0.4248 - val_dice_coef: 0.5736 - val_recall: 0.7023 - val_precision: 0.6755 - lr: 1.0000e-04
Epoch 42/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2464 - dice_coef: 0.7536 - recall: 0.9415 - precision: 0.8621 - val_loss: 0.4057 - val_dice_coef: 0.5923 - val_recall: 0.7633 - val_precision: 0.6516 - lr: 1.0000e-04
Epoch 43/500
50/50 [==============================] - 1s 21ms/step - loss: 0.2407 - dice_coef: 0.7593 - recall: 0.9415 - precision: 0.8645 - val_loss: 0.4225 - val_dice_coef: 0.5775 - val_recall: 0.6366 - val_precision: 0.7495 - lr: 1.0000e-04
Epoch 44/500
50/50 [==============================] - 1s 19ms/step - loss: 0.2283 - dice_coef: 0.7717 - recall: 0.9454 - precision: 0.8741 - val_loss: 0.4261 - val_dice_coef: 0.5765 - val_recall: 0.6419 - val_precision: 0.6906 - lr: 1.0000e-04
Epoch 45/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2300 - dice_coef: 0.7700 - recall: 0.9406 - precision: 0.8740 - val_loss: 0.4184 - val_dice_coef: 0.5809 - val_recall: 0.6528 - val_precision: 0.7229 - lr: 1.0000e-04
Epoch 46/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2330 - dice_coef: 0.7670 - recall: 0.9298 - precision: 0.8673 - val_loss: 0.3987 - val_dice_coef: 0.6016 - val_recall: 0.7357 - val_precision: 0.6609 - lr: 1.0000e-04
Epoch 47/500
50/50 [==============================] - 1s 19ms/step - loss: 0.2227 - dice_coef: 0.7773 - recall: 0.9331 - precision: 0.8824 - val_loss: 0.4015 - val_dice_coef: 0.5977 - val_recall: 0.6761 - val_precision: 0.7075 - lr: 1.0000e-04
Epoch 48/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2127 - dice_coef: 0.7873 - recall: 0.9414 - precision: 0.8842 - val_loss: 0.4072 - val_dice_coef: 0.5934 - val_recall: 0.6838 - val_precision: 0.6743 - lr: 1.0000e-04
Epoch 49/500
50/50 [==============================] - 1s 20ms/step - loss: 0.2059 - dice_coef: 0.7941 - recall: 0.9403 - precision: 0.8877 - val_loss: 0.4066 - val_dice_coef: 0.5931 - val_recall: 0.6579 - val_precision: 0.6941 - lr: 1.0000e-04
Epoch 50/500
50/50 
[==============================] - 1s 20ms/step - loss: 0.2004 - dice_coef: 0.7996 - recall: 0.9384 - precision: 0.8924 - val_loss: 0.4348 - val_dice_coef: 0.5663 - val_recall: 0.5707 - val_precision: 0.7380 - lr: 1.0000e-04 Epoch 51/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1953 - dice_coef: 0.8047 - recall: 0.9401 - precision: 0.8963 - val_loss: 0.4066 - val_dice_coef: 0.5930 - val_recall: 0.6714 - val_precision: 0.6657 - lr: 1.0000e-04 Epoch 52/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1899 - dice_coef: 0.8101 - recall: 0.9363 - precision: 0.9045 - val_loss: 0.3922 - val_dice_coef: 0.6079 - val_recall: 0.6951 - val_precision: 0.6732 - lr: 1.0000e-05 Epoch 53/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1804 - dice_coef: 0.8196 - recall: 0.9485 - precision: 0.9078 - val_loss: 0.3989 - val_dice_coef: 0.6013 - val_recall: 0.6558 - val_precision: 0.7066 - lr: 1.0000e-05 Epoch 54/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1793 - dice_coef: 0.8207 - recall: 0.9532 - precision: 0.9088 - val_loss: 0.3963 - val_dice_coef: 0.6037 - val_recall: 0.6699 - val_precision: 0.6968 - lr: 1.0000e-05 Epoch 55/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1850 - dice_coef: 0.8150 - recall: 0.9430 - precision: 0.9093 - val_loss: 0.3948 - val_dice_coef: 0.6049 - val_recall: 0.6942 - val_precision: 0.6734 - lr: 1.0000e-05 Epoch 56/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1779 - dice_coef: 0.8221 - recall: 0.9512 - precision: 0.9121 - val_loss: 0.3919 - val_dice_coef: 0.6075 - val_recall: 0.6973 - val_precision: 0.6765 - lr: 1.0000e-05 Epoch 57/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1803 - dice_coef: 0.8197 - recall: 0.9482 - precision: 0.9122 - val_loss: 0.3924 - val_dice_coef: 0.6071 - val_recall: 0.6867 - val_precision: 0.6854 - lr: 1.0000e-05 Epoch 58/500 50/50 [==============================] - 1s 20ms/step 
- loss: 0.1806 - dice_coef: 0.8194 - recall: 0.9526 - precision: 0.9114 - val_loss: 0.3892 - val_dice_coef: 0.6101 - val_recall: 0.6956 - val_precision: 0.6825 - lr: 1.0000e-05 Epoch 59/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1787 - dice_coef: 0.8213 - recall: 0.9481 - precision: 0.9139 - val_loss: 0.3837 - val_dice_coef: 0.6156 - val_recall: 0.7381 - val_precision: 0.6515 - lr: 1.0000e-05 Epoch 60/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1755 - dice_coef: 0.8245 - recall: 0.9530 - precision: 0.9118 - val_loss: 0.3988 - val_dice_coef: 0.6007 - val_recall: 0.6669 - val_precision: 0.6941 - lr: 1.0000e-05 Epoch 61/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1826 - dice_coef: 0.8174 - recall: 0.9443 - precision: 0.9124 - val_loss: 0.3945 - val_dice_coef: 0.6050 - val_recall: 0.6794 - val_precision: 0.6872 - lr: 1.0000e-05 Epoch 62/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1845 - dice_coef: 0.8155 - recall: 0.9423 - precision: 0.9133 - val_loss: 0.3870 - val_dice_coef: 0.6127 - val_recall: 0.7097 - val_precision: 0.6668 - lr: 1.0000e-05 Epoch 63/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1765 - dice_coef: 0.8235 - recall: 0.9482 - precision: 0.9134 - val_loss: 0.3867 - val_dice_coef: 0.6130 - val_recall: 0.7140 - val_precision: 0.6623 - lr: 1.0000e-05 Epoch 64/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1748 - dice_coef: 0.8252 - recall: 0.9508 - precision: 0.9148 - val_loss: 0.3920 - val_dice_coef: 0.6077 - val_recall: 0.6696 - val_precision: 0.7039 - lr: 1.0000e-05 Epoch 65/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1778 - dice_coef: 0.8222 - recall: 0.9433 - precision: 0.9159 - val_loss: 0.3917 - val_dice_coef: 0.6079 - val_recall: 0.6779 - val_precision: 0.6934 - lr: 1.0000e-06 Epoch 66/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1791 - dice_coef: 0.8209 - recall: 
0.9468 - precision: 0.9154 - val_loss: 0.3915 - val_dice_coef: 0.6081 - val_recall: 0.6850 - val_precision: 0.6858 - lr: 1.0000e-06 Epoch 67/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1775 - dice_coef: 0.8225 - recall: 0.9483 - precision: 0.9165 - val_loss: 0.3901 - val_dice_coef: 0.6095 - val_recall: 0.6928 - val_precision: 0.6798 - lr: 1.0000e-06 Epoch 68/500 50/50 [==============================] - 1s 20ms/step - loss: 0.1709 - dice_coef: 0.8291 - recall: 0.9528 - precision: 0.9176 - val_loss: 0.3916 - val_dice_coef: 0.6081 - val_recall: 0.6880 - val_precision: 0.6815 - lr: 1.0000e-06 Epoch 69/500 50/50 [==============================] - 1s 19ms/step - loss: 0.1745 - dice_coef: 0.8255 - recall: 0.9515 - precision: 0.9165 - val_loss: 0.3909 - val_dice_coef: 0.6087 - val_recall: 0.6927 - val_precision: 0.6776 - lr: 1.0000e-06
Q. 2.D. Evaluate and share insights on performance of the model.¶
In [ ]:
metrics = model.evaluate(X_test, Y_test, steps=test_steps)
print("Model Performance Metrics:")
print(f" Dice Loss = {metrics[0]}")
print(f" Dice Coefficient = {metrics[1]}")
print(f" Recall = {metrics[2]}")
print(f" Precision = {metrics[3]}")
2/2 [==============================] - 0s 13ms/step - loss: 0.3909 - dice_coef: 0.6087 - recall: 0.6927 - precision: 0.6776
Model Performance Metrics:
 Dice Loss = 0.3909347653388977
 Dice Coefficient = 0.6086981296539307
 Recall = 0.6926884651184082
 Precision = 0.677649736404419
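The reported loss and Dice coefficient sum to 1, which is consistent with a Dice-based loss (loss = 1 − Dice). A minimal NumPy sketch of the metric, for reference only — the notebook's actual `dice_coef` is defined where the model is compiled and may use a different smoothing constant:

```python
import numpy as np

def dice_coef(y_true, y_pred, smooth=1.0):
    # Dice = 2*|A ∩ B| / (|A| + |B|); `smooth` avoids division by zero on empty masks
    y_true = y_true.astype(np.float32).ravel()
    y_pred = y_pred.astype(np.float32).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    # Complement of the Dice coefficient, so perfect overlap gives zero loss
    return 1.0 - dice_coef(y_true, y_pred)
```

With `smooth=1.0`, a perfectly predicted mask scores 1.0 and disjoint masks score close to 0, which matches the loss/dice_coef pairing seen throughout the training log.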
In [ ]:
#Show the training vs validation loss
plt.figure(figsize=(10, 5))
plt.plot(model_run.history['loss'], label='Training Loss')
plt.plot(model_run.history['val_loss'], label='Validation Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.show()
#Show the training vs validation dice coef over the epochs
plt.plot(model_run.history['dice_coef'], label='Training Dice Coef')
plt.plot(model_run.history['val_dice_coef'], label='Validation Dice Coef')
plt.plot(model_run.history['recall'], label='Training Recall')
plt.plot(model_run.history['val_recall'], label='Validation Recall')
plt.plot(model_run.history['precision'], label='Training Precision')
plt.plot(model_run.history['val_precision'], label='Validation Precision')
plt.title('Model Metrics')
plt.ylabel('Metrics')
plt.xlabel('Epoch')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.)
plt.show()
Insights on the performance of the model¶
- The model was trained on 400 images and tested on 9 images.
- Training was configured for 500 epochs with a batch size of 8, but it stopped early (after 69 epochs) as the loss had flattened out.
- During the run, training loss declined steadily; validation loss also declined before it stabilized and flattened out.
- Training recall reached a steady value quickly, while validation recall declined over time.
- Precision improved over the epochs for both training and validation, although validation precision fluctuated considerably towards the end.
- The Dice coefficient improved for both training and validation together before flattening out.
- Validation metrics fluctuate more and stabilize further from the corresponding training metrics, which indicates some overfitting, most likely due to the small and possibly imbalanced dataset.
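The early stop and the stepwise learning-rate drops visible in the log (1e-04 → 1e-05 → 1e-06) are the typical behaviour of patience-based callbacks such as Keras `ReduceLROnPlateau` and `EarlyStopping`. A framework-free sketch of that logic — the patience values and reduction factor here are illustrative assumptions, not the notebook's actual callback settings:

```python
class PlateauMonitor:
    """Patience-based monitor mimicking ReduceLROnPlateau + EarlyStopping.

    Hypothetical sketch: the real callbacks are configured where the model
    is compiled, and their patience values may differ.
    """

    def __init__(self, lr=1e-4, factor=0.1, lr_patience=5, stop_patience=12):
        self.lr = lr
        self.factor = factor
        self.lr_patience = lr_patience      # epochs without improvement before cutting lr
        self.stop_patience = stop_patience  # epochs without improvement before stopping
        self.best = float('inf')
        self.wait = 0

    def update(self, val_loss):
        """Feed one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.wait = 0
            return False
        self.wait += 1
        if self.wait % self.lr_patience == 0:
            self.lr *= self.factor  # plateau detected: reduce the learning rate
        return self.wait >= self.stop_patience
```

Once validation loss stops improving, the monitor first steps the learning rate down (producing the 1e-05 and 1e-06 phases seen in the log) and eventually halts training.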
3. Test the model predictions on the test image: ‘image with index 3 in the test data’ and visualise the predicted masks on the faces in the image. [2 Marks]¶
In [ ]:
def show_original_vs_predicted_face_area(i):
    fig, axs = plt.subplots(1, 3, figsize=(20, 10))
    test_image = np.copy(X_test[i])
    # Draw a contour around the labelled face in test_image from the ground-truth mask
    contours, _ = cv2.findContours(Y_test[i].astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(test_image, contours, -1, (0, 255, 0), 1)
    axs[0].imshow((test_image / 255).astype(np.float32))
    axs[0].set_title("Original Image with Labelled Face Contour")
    axs[0].axis('off')
    test_image = np.copy(X_test[i])
    Y_pred = model.predict(np.array([test_image]))
    # Threshold the predicted probabilities at 0.5 and resize to the image size
    pred_mask = cv2.resize((1.0 * (Y_pred[0] > 0.5)), (image_width, image_height))
    axs[1].imshow((test_image / 255).astype(np.float32))
    axs[1].imshow(pred_mask, alpha=0.5)
    axs[1].set_title("Original Image with Predicted Mask")
    axs[1].axis('off')
    # Draw a contour around the detected face in test_image from the predicted mask
    contours, _ = cv2.findContours(pred_mask.astype(np.uint8), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(test_image, contours, -1, (0, 255, 0), 1)
    axs[2].imshow((test_image / 255).astype(np.float32))
    axs[2].set_title("Predicted Face Contour in Original Image")
    axs[2].axis('off')
    plt.show()

# Show the image with index 3
show_original_vs_predicted_face_area(3)
1/1 [==============================] - 1s 1s/step
In [ ]:
del data
del X
del Y
del model
PART B - 10 Marks¶
- DOMAIN: Entertainment
- CONTEXT: Company X owns a movie application and repository which provides movie streaming to millions of users on a subscription basis.
The company wants to automate the cast and crew information for each scene in a movie, so that when a user pauses the movie
and clicks the cast information button, the app shows details of the actors in the scene. The company has in-house computer vision and
multimedia experts who need to detect faces in screenshots from the movie scenes.
The data labelling is already done.
- DATA DESCRIPTION: The dataset comprises images and their masks for the corresponding human faces.
- PROJECT OBJECTIVE: To create an image dataset to be used by the AI team to build an image classifier. Profile images of people are given.
Steps and tasks: [ Total Score: 10 Marks]¶
1. Read/import images from folder ‘training_images’. [2 Marks]¶
In [ ]:
#Import images from folder 'training_images'
#Unzip the file training_images.zip into the session
training_images_zip = f'{project_dir}/training_images.zip'  # Google Drive path
#Extract to a local folder; loading images directly from Google Drive is time consuming
training_images_extract_folder = 'training_images'
with zipfile.ZipFile(training_images_zip, 'r') as zip_ref:
    zip_ref.extractall(training_images_extract_folder)
training_images = os.listdir(training_images_extract_folder + '/training_images')
print(training_images)
['real_00909.jpg', 'real_00902.jpg', 'real_00554.jpg', 'real_00791.jpg', 'real_00331.jpg', ..., 'real_00400.jpg', 'real_01073.jpg', 'real_00952.jpg']
(full listing of the extracted training image filenames elided)
2. Write a loop which will iterate through all the images in the ‘training_images’ folder and detect the faces present on all the images. [3 Marks ]¶
Hint: You can use the open-source ’haarcascade_frontalface_default.xml’ cascade (bundled with OpenCV) to detect faces.
In [ ]:
#Iterate through all images in the folder and detect the face area
#Use the open-source 'haarcascade_frontalface_default.xml' cascade bundled with OpenCV
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
pbar = tqdm(training_images, ascii=True)
for img in pbar:
    pbar.set_description(img)
    original_image = cv2.imread(f'{training_images_extract_folder}/training_images/{img}')
    # Convert to grayscale and equalize the histogram before detection
    gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)
    faces = face_cascade.detectMultiScale(gray)
    pbar.set_postfix({'num_faces': len(faces)})
real_00952.jpg: 100%|##########| 1091/1091 [01:08<00:00, 15.89it/s, num_faces=1]
3. From the same loop above, extract metadata of the faces and write into a DataFrame. [3 Marks]¶
Sample output:
| | x | y | w | h | Total_Faces | Image_Name |
|---|---|---|---|---|---|---|
| 0 | 94 | 144 | 390 | 390 | 1 | real_00251.jpg |
| 1 | 65 | 87 | 459 | 459 | 1 | real_00537.jpg |
In [ ]:
#Extract the face area from each image as above and save it in a DataFrame
#with columns 'x', 'y', 'w', 'h', 'Total_Faces' and 'Image_Name'
face_data = []
for img in tqdm(training_images, ascii=True):
    original_image = cv2.imread(f'{training_images_extract_folder}/training_images/{img}')
    gray = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)
    faces = face_cascade.detectMultiScale(gray)
    total_faces = len(faces)
    if total_faces == 0:
        face_data.append([0, 0, 0, 0, 0, img])  # No faces found
    for (x, y, w, h) in faces:
        face_data.append([x, y, w, h, total_faces, img])
face_data = pd.DataFrame(face_data, columns=['x', 'y', 'w', 'h', 'Total_Faces', 'Image_Name'])
print()
print(face_data)
100%|##########| 1091/1091 [01:06<00:00, 16.50it/s]
x y w h Total_Faces Image_Name
0 73 120 427 427 1 real_00909.jpg
1 62 79 462 462 1 real_00902.jpg
2 96 156 405 405 1 real_00554.jpg
3 90 76 413 413 1 real_00791.jpg
4 122 130 425 425 1 real_00331.jpg
... ... ... ... ... ... ...
1228 30 9 512 512 1 real_00924.jpg
1229 68 53 456 456 1 real_00400.jpg
1230 2 52 56 56 2 real_01073.jpg
1231 86 139 408 408 2 real_01073.jpg
1232 58 14 486 486 1 real_00952.jpg
[1233 rows x 6 columns]
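The (x, y, w, h) values from detectMultiScale give the top-left corner plus width and height. When drawing boxes with cv2.rectangle you also need the bottom-right corner; a minimal sketch using the sample values above (this small DataFrame is a hypothetical stand-in for face_data):

```python
import pandas as pd

# Hypothetical detections in the same (x, y, w, h) format as face_data
df = pd.DataFrame([[94, 144, 390, 390], [65, 87, 459, 459]],
                  columns=['x', 'y', 'w', 'h'])

# Bottom-right corner of each box, e.g. for cv2.rectangle(img, (x, y), (x2, y2), ...)
df['x2'] = df['x'] + df['w']
df['y2'] = df['y'] + df['h']
print(df)
```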
4. Save the output Dataframe in .csv format. [2 Marks]¶
In [ ]:
#Save the dataframe to a csv file
face_data_csv = f'{project_dir}/face_data.csv' # Google Drive path
face_data.to_csv(face_data_csv, index=False)
del face_data
PART C - 30 Marks¶
- DOMAIN: Face Recognition
- CONTEXT: Company X intends to build a face identification model to recognise human faces.
- DATA DESCRIPTION: The 'Face Aligned Face Dataset from Pinterest' (PINS) comprises 10,770 images of 100 people. All images were taken from Pinterest and aligned using the dlib library.
- PROJECT OBJECTIVE: To build a face identification model that recognises the person in a given face image.
Steps and tasks: [ Total Score: 30 Marks]¶
1. Unzip, read and Load data(‘PINS.zip’) into session. [2 Marks]¶
In [ ]:
#Unzip the file PINS.zip into session
pins_zip = f'{project_dir}/PINS.zip' # Google Drive path
#We will extract to local folder, loading image from google drive is time consuming
pins_extract_folder = 'PINS'
with zipfile.ZipFile(pins_zip, 'r') as zip_ref:
    zip_ref.extractall(pins_extract_folder)
2. Write function to create metadata of the image. [4 Marks]¶
Hint: Metadata means derived information from the available data which can be useful for particular problem statement.
In [ ]:
#Write function to create metadata for the images
#[This is copied from the Hint notebook shared - 'Hint - CV - 2_Part 2_.ipynb']
#Class to create metadata for an image file
#   base - base directory of the dataset
#   name - identity (folder) name
#   file - image file name
#   person_name - name of the person, i.e. the folder name after the 'pins_' prefix
class IdentityMetadata():
    def __init__(self, base, name, file):
        # dataset base directory
        self.base = base
        # identity name
        self.name = name
        # image file name
        self.file = file
        # person name: obtained by removing the 'pins_' prefix from the folder name
        self.person_name = name.split('_')[1]

    def __repr__(self):
        return self.image_path()

    def image_path(self):
        return os.path.join(self.base, self.name, self.file)
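The person_name parsing above assumes exactly one underscore in the folder name (the 'pins_' prefix). A minimal sketch of that parsing with hypothetical folder names; note that split('_')[1] would truncate a person name that itself contains an underscore, which split('_', 1)[1] avoids:

```python
import os

# Hypothetical folder name following the PINS layout: 'pins_<person name>'
name = 'pins_Tom Hardy'
person_name = name.split('_')[1]       # what IdentityMetadata.__init__ does
print(person_name)                     # -> Tom Hardy
print(os.path.join('PINS/PINS', name, 'Tom Hardy0.jpg'))  # what image_path() builds

# A folder whose person name contains '_' needs maxsplit=1 to stay intact
print('pins_mary_kate'.split('_', 1)[1])  # -> mary_kate
```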
3. Write a loop to iterate through each and every image and create metadata for all the images. [4 Marks]¶
In [ ]:
#Function to load metadata
#This loops through the dataset and creates metadata for each image
#   path - path of the dataset
def load_metadata(path):
    metadata = []
    for i in os.listdir(path):
        for f in os.listdir(os.path.join(path, i)):
            # Check the file extension; allow only '.jpg'/'.jpeg' files
            ext = os.path.splitext(f)[1]
            if ext == '.jpg' or ext == '.jpeg':
                metadata.append(IdentityMetadata(path, i, f))
    return np.array(metadata)

#Now load the metadata for the images from the 'PINS/PINS' folder
#(PINS.zip was unzipped into 'PINS', which contains another PINS folder with the images)
pins_folder = f'{pins_extract_folder}/PINS'
metadata = load_metadata(pins_folder)

#This function reads the image data from the given image path
def load_image(path):
    img = cv2.imread(path, 1)
    # OpenCV loads images with colour channels in BGR order,
    # so reverse them to get RGB
    return img[..., ::-1]
In [ ]:
# Print and display the first image to check
img=load_image(metadata[0].image_path())
print(img)
plt.imshow(img)
plt.show()
[[[67 56 50] [62 51 45] [56 45 39] ... [22 19 12] [22 19 12] [23 18 12]] [[65 54 48] [64 53 47] [56 45 39] ... [22 19 12] [22 19 12] [23 18 12]] [[61 50 46] [59 48 44] [47 36 30] ... [23 18 12] [23 18 12] [23 18 12]] ... [[60 40 41] [60 40 41] [56 36 37] ... [30 21 16] [31 22 17] [31 22 17]] [[60 40 41] [60 40 41] [57 37 38] ... [30 21 16] [31 22 17] [31 22 17]] [[61 41 42] [60 40 41] [57 37 38] ... [30 21 16] [31 22 17] [31 22 17]]]
4. Generate Embeddings vectors on the each face in the dataset. [4 Marks]¶
Hint: Use ‘vgg_face_weights.h5’
In [ ]:
#Define the VGG-Face model
def vgg_face():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.ZeroPadding2D((1,1), input_shape=(224,224,3)))
    model.add(tf.keras.layers.Convolution2D(64, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(64, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(128, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(128, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(256, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(256, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(256, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.ZeroPadding2D((1,1)))
    model.add(tf.keras.layers.Convolution2D(512, (3, 3), activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(tf.keras.layers.Convolution2D(4096, (7, 7), activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Convolution2D(4096, (1, 1), activation='relu'))
    model.add(tf.keras.layers.Dropout(0.5))
    model.add(tf.keras.layers.Convolution2D(2622, (1, 1)))
    model.add(tf.keras.layers.Flatten())
    model.add(tf.keras.layers.Activation('softmax'))
    return model
In [ ]:
#Now create the VGG model and load the weights
weight_file = f'{project_dir}/vgg_face_weights.h5' # Google Drive path
model = vgg_face()
model.load_weights(weight_file)
In [ ]:
#Create the VGG Face Descriptor model - this model will be used to extract features (embeddings) from the images.
#Therefore we remove the last (softmax) layer from the VGG model and use the output of the second-to-last layer as the output of the new model.
vgg_face_descriptor = tf.keras.models.Model(inputs=model.layers[0].input, outputs=model.layers[-2].output)
In [ ]:
# Getting embedding vector for first image in the metadata using the pre-trained model
img_path = metadata[0].image_path()
img = load_image(img_path)
# Normalise pixel values from [0, 255] to [0, 1]
img = (img / 255.).astype(np.float32)
img = cv2.resize(img, dsize = (224,224))
print(img.shape)
# Obtaining embedding vector for an image
# Getting the embedding vector for the above image using vgg_face_descriptor model and printing the shape
embedding_vector = vgg_face_descriptor.predict(np.expand_dims(img, axis=0))[0]
print(embedding_vector.shape)
(224, 224, 3) 1/1 [==============================] - 1s 740ms/step (2622,)
In [ ]:
#Get the embedding vectors for all the images in the metadata
#Initialize the embeddings: embedding size per image is 2622 (as seen from the output above)
embeddings = np.zeros((metadata.shape[0], 2622))
#We will generate embeddings in batches of images
batch_size = 1000  #@param {type:"integer"}
total_batches = len(embeddings)//batch_size + (1 if (len(embeddings) % batch_size > 0) else 0)
for idx in range(0, len(embeddings), batch_size):
    #Batch image indexes - start and end values
    start = idx
    end = idx+batch_size if idx+batch_size < len(embeddings) else len(embeddings)
    #Create the input array for the batch - each input tensor of size [224,224,3]
    inputs = np.zeros([end-start, 224, 224, 3])
    #Get the metadata for the batch 'start' to 'end' and populate the input array
    pbar = tqdm(metadata[start:end], ascii=True)
    pbar.set_description(f'Batch {idx//batch_size+1} of {total_batches}')
    for i, m in enumerate(pbar):
        try:
            img = load_image(m.image_path())
            #Resize and scale RGB values to the interval [0,1]
            img = cv2.resize(img, dsize=(224,224))
            img = (img / 255.).astype(np.float32)
            #Set the input tensor for the image
            inputs[i] = img
        except Exception as e:
            print(str(e))
            print(i, m)
    #Now set the embeddings for the batch
    embeddings[start:end] = vgg_face_descriptor.predict(inputs)
    del inputs  #delete to release memory
    print()  #print a newline for the pbar
Batch 1 of 11: 100%|##########| 1000/1000 [00:04<00:00, 230.98it/s]
32/32 [==============================] - 3s 48ms/step
Batch 2 of 11: 100%|##########| 1000/1000 [00:04<00:00, 226.78it/s]
32/32 [==============================] - 1s 29ms/step
Batch 3 of 11: 100%|##########| 1000/1000 [00:04<00:00, 223.87it/s]
32/32 [==============================] - 1s 29ms/step
Batch 4 of 11: 100%|##########| 1000/1000 [00:04<00:00, 226.67it/s]
32/32 [==============================] - 1s 29ms/step
Batch 5 of 11: 100%|##########| 1000/1000 [00:04<00:00, 227.17it/s]
32/32 [==============================] - 1s 29ms/step
Batch 6 of 11: 100%|##########| 1000/1000 [00:04<00:00, 226.87it/s]
32/32 [==============================] - 1s 29ms/step
Batch 7 of 11: 100%|##########| 1000/1000 [00:04<00:00, 227.22it/s]
32/32 [==============================] - 1s 30ms/step
Batch 8 of 11: 100%|##########| 1000/1000 [00:04<00:00, 225.56it/s]
32/32 [==============================] - 1s 29ms/step
Batch 9 of 11: 100%|##########| 1000/1000 [00:04<00:00, 226.01it/s]
32/32 [==============================] - 1s 30ms/step
Batch 10 of 11: 100%|##########| 1000/1000 [00:04<00:00, 225.00it/s]
32/32 [==============================] - 1s 29ms/step
Batch 11 of 11: 100%|##########| 770/770 [00:03<00:00, 225.48it/s]
25/25 [==============================] - 1s 47ms/step
5. Build distance metrics for identifying the distance between two similar and dissimilar images. [4 Marks]¶
In [ ]:
def distance(emb1, emb2):
    # Squared Euclidean (L2) distance between two embedding vectors
    return np.sum(np.square(emb1 - emb2))
In [ ]:
def show_pair(idx1, idx2):
    plt.figure(figsize=(8,3))
    plt.suptitle(f'Distance = {distance(embeddings[idx1], embeddings[idx2]):.2f}')
    plt.subplot(121)
    plt.imshow(load_image(metadata[idx1].image_path()))
    plt.title(metadata[idx1].person_name)
    plt.subplot(122)
    plt.imshow(load_image(metadata[idx2].image_path()))
    plt.title(metadata[idx2].person_name)

show_pair(102, 103)  # similar pair
show_pair(102, 480)  # dissimilar pair
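The distance() above is the squared Euclidean (L2) distance on raw embeddings. Cosine distance is a common alternative for face embeddings because it ignores vector magnitude; a minimal sketch, not part of the graded tasks:

```python
import numpy as np

def cosine_distance(emb1, emb2):
    # 1 - cosine similarity: 0 for identical directions, up to 2 for opposite ones
    return 1.0 - np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(cosine_distance(a, a))  # -> 0.0
print(cosine_distance(a, b))  # -> 1.0 (orthogonal vectors)
```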
6. Use PCA for dimensionality reduction. [2 Marks]¶
In [ ]:
#Create train and test data
#Every index divisible by 9 goes to the test set and the rest to the train set,
#i.e. roughly an 8:1 train/test split
train_idx = np.arange(metadata.shape[0]) % 9 != 0
test_idx = np.arange(metadata.shape[0]) % 9 == 0
# Create X_train and X_test using the above indices
X_train = embeddings[train_idx]
X_test = embeddings[test_idx]
#For each image 'person_name' from the metadata is the target label
targets = np.array([m.person_name for m in metadata])
#Again store the y_train and y_test as per the same indices as above
y_train = targets[train_idx]
y_test = targets[test_idx]
In [ ]:
# Encode the target identities
encoder = LabelEncoder()
y_train = encoder.fit_transform(y_train)
y_test = encoder.transform(y_test)
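LabelEncoder maps the person-name strings to integer class labels (in sorted order) and can map predictions back via inverse_transform, which is how identify() recovers the name later. A minimal round-trip sketch with hypothetical names:

```python
from sklearn.preprocessing import LabelEncoder

names = ['Tom Hardy', 'Dwayne Johnson', 'Tom Hardy']  # stand-in for the targets array
enc = LabelEncoder()
codes = enc.fit_transform(names)
print(list(codes))                         # classes are sorted, so 'Dwayne Johnson' -> 0
print(list(enc.inverse_transform(codes)))  # back to the original names
```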
In [ ]:
#Scale the features using StandardScaler
# Standarize features
sc = StandardScaler()
X_train_sc = sc.fit_transform(X_train)
X_test_sc = sc.transform(X_test)
In [ ]:
# Reduce feature dimensions using Principal Component Analysis
# Covariance matrix
cov_matrix = np.cov(X_train_sc.T)
# Eigen values and vector
eig_vals, eig_vecs = np.linalg.eig(cov_matrix)
# Cumulative variance explained
tot = sum(eig_vals)
var_exp = [(i /tot) * 100 for i in sorted(eig_vals, reverse = True)]
cum_var_exp = np.cumsum(var_exp)
print('Cumulative Variance Explained', cum_var_exp)
Cumulative Variance Explained [ 13.55366322 18.95781462 22.92683789 ... 99.99999983 99.99999999 100. ]
In [ ]:
# Get index where cumulative variance explained is > threshold
thres = 95
res = list(filter(lambda i: i > thres, cum_var_exp))[0]
index = (cum_var_exp.tolist().index(res))
print(f'Index of element just greater than {thres}: {str(index)}')
Index of element just greater than 95: 346
In [ ]:
# Plotting the explained variance
plt.figure(figsize = (15 , 7.2))
plt.bar(range(1, eig_vals.size + 1), var_exp, alpha = 0.5, align = 'center', label = 'Individual explained variance')
plt.step(range(1, eig_vals.size + 1), cum_var_exp, where = 'mid', label = 'Cumulative explained variance')
plt.axhline(y = thres, color = 'r', linestyle = '--')
plt.axvline(x = index, color = 'r', linestyle = '--')
plt.ylabel('Explained Variance Ratio')
plt.xlabel('Principal Components')
plt.legend(loc = 'best')
plt.tight_layout()
plt.show()
In [ ]:
# Reducing the dimensions
pca = PCA(n_components = index, random_state = 20, svd_solver = 'full', whiten = True)
pca.fit(X_train_sc)
X_train_pca = pca.transform(X_train_sc)
X_test_pca = pca.transform(X_test_sc)
display(X_train_pca.shape, X_test_pca.shape)
(9573, 346)
(1197, 346)
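The component count found via the manual eigendecomposition can be cross-checked against sklearn directly: PCA accepts a float n_components and picks the smallest number of components reaching that variance fraction. A minimal sketch on synthetic data (shapes are illustrative, not the real embeddings):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(20)
X = StandardScaler().fit_transform(rng.normal(size=(200, 50)))  # stand-in for X_train_sc

pca = PCA(n_components=0.95, svd_solver='full')  # keep 95% of the variance
X_red = pca.fit_transform(X)
print(X_red.shape)  # at most 50 columns remain
print(np.cumsum(pca.explained_variance_ratio_)[-1])  # >= 0.95 by construction
```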
7. Build an SVM classifier in order to map each image to its right person. [4 Marks]¶
In [ ]:
#Try to get the best parameters for SVC using grid search
params_grid = [{'kernel': ['rbf'], 'gamma': [1e-2, 1e-3, 1e-4], 'C': [1, 10, 100, 1000], 'class_weight': ['balanced', None]}]
svc_search = GridSearchCV(SVC(random_state = 20, verbose=True), params_grid, cv = 3, scoring = 'f1_macro', verbose=4)
svc_search.fit(X_train_pca, y_train)
print()
print('Best estimator found by grid search:')
print(svc_search.best_estimator_)
Fitting 3 folds for each of 24 candidates, totalling 72 fits
[CV] ... (per-fold scores omitted for brevity; gamma=0.001 candidates consistently scored ~0.955-0.961, gamma=0.01 candidates ~0.65-0.70) ...
Best estimator found by grid search:
SVC(C=1, class_weight='balanced', gamma=0.001, random_state=20, verbose=True)
In [ ]:
#Use the best estimator as found above
clf = SVC(kernel='rbf', C=svc_search.best_estimator_.C, class_weight=svc_search.best_estimator_.class_weight, gamma=svc_search.best_estimator_.gamma, probability=True, random_state=20)
print(f'Running SVC with the best parameters as found above : {clf}')
clf.fit(X_train_pca, y_train)
print('SVC accuracy for train set: {0:.3f}'.format(clf.score(X_train_pca, y_train)))
print('SVC accuracy for test set: {0:.3f}'.format(clf.score(X_test_pca, y_test)))
Running SVC with the best parameters as found above : SVC(C=1, class_weight='balanced', gamma=0.001, probability=True,
random_state=20)
SVC accuracy for train set: 0.994
SVC accuracy for test set: 0.967
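In a 100-class problem overall accuracy can hide weak identities; sklearn's classification_report gives per-class precision/recall/F1 (in the notebook you would pass y_test and clf.predict(X_test_pca)). A minimal sketch with hypothetical labels:

```python
from sklearn.metrics import classification_report, f1_score

# Hypothetical stand-ins for y_test and clf.predict(X_test_pca)
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print(classification_report(y_true, y_pred, digits=3))
macro_f1 = f1_score(y_true, y_pred, average='macro')  # the scoring metric used in the grid search
print(macro_f1)
```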
8. Import and display the test images. [2 Marks]¶
Hint: ‘Benedict Cumberbatch9.jpg’ and ‘Dwayne Johnson4.jpg’ are the test images.
In [ ]:
# Suppressing LabelEncoder warnings
warnings.filterwarnings('ignore')

#Create a function to display an image
def display_image(image_path, title):
    example_image = load_image(image_path)
    plt.imshow(example_image)
    plt.title(title)

Benedict_Cumberbatch9 = f'{project_dir}/Benedict Cumberbatch9.jpg' # Google Drive path
Dwayne_Johnson4 = f'{project_dir}/Dwayne Johnson4.jpg' # Google Drive path
display_image(Benedict_Cumberbatch9, "Benedict Cumberbatch")
In [ ]:
display_image(Dwayne_Johnson4, "Dwayne Johnson")
9. Use the trained SVM model to predict the face on both test images. [4 Marks]¶
In [ ]:
def identify(image_path):
    test_image = load_image(image_path)
    #Resize and scale RGB values to the interval [0,1]
    img = cv2.resize(test_image, dsize=(224,224))
    img = (img / 255.).astype(np.float32)
    #Set the input tensor for the image ('inputs' avoids shadowing the built-in 'input')
    inputs = np.expand_dims(img, axis=0)
    test_image_embeddings = vgg_face_descriptor.predict(inputs)[0]
    #Scale the features using the standard scaler fitted earlier
    X_test_new_sc = sc.transform([test_image_embeddings])
    #Reduce feature dimensions using the PCA transform fitted earlier
    X_test_new_pca = pca.transform(X_test_new_sc)
    #Predict the identity using the SVC model
    prediction = clf.predict(X_test_new_pca)
    prediction_probability = clf.predict_proba(X_test_new_pca)
    #Get the identity name from the encoder (inverse transform)
    identity = encoder.inverse_transform(prediction)[0]
    #Display the image with the identified name
    plt.imshow(test_image)
    plt.title(f'Identified as "{identity}"\n(Match Probability: {prediction_probability[0][prediction][0]:.2f})')

identify(Benedict_Cumberbatch9)
1/1 [==============================] - 0s 24ms/step
In [ ]:
identify(Dwayne_Johnson4)
1/1 [==============================] - 0s 22ms/step